GAWK(1)                     Utility Commands                     GAWK(1)

NAME
     gawk - pattern scanning and processing language

SYNOPSIS
     gawk [ POSIX or GNU style options ] -f program-file [ -- ] file ...
     gawk [ POSIX or GNU style options ] [ -- ] program-text file ...

DESCRIPTION
     Gawk is the GNU Project's implementation of the AWK programming
     language. It conforms to the definition of the language in the
     POSIX 1003.2 Command Language And Utilities Standard. This
     version in turn is based on the description in The AWK
     Programming Language, by Aho, Kernighan, and Weinberger, with
     the additional features defined in the System V Release 4
     version of UNIX awk. Gawk also provides some GNU-specific
     extensions.

     The command line consists of options to gawk itself, the AWK
     program text (if not supplied via the -f or --file options), and
     values to be made available in the ARGC and ARGV pre-defined AWK
     variables.

OPTIONS
     Gawk options may be either the traditional POSIX one-letter
     options, or the GNU style long options. POSIX style options
     start with a single ``-'', while GNU long options start with
     ``--''. GNU style long options are provided for both
     GNU-specific features and for POSIX mandated features. Other
     implementations of the AWK language are likely to only accept
     the traditional one-letter options.

     Following the POSIX standard, gawk-specific options are supplied
     via arguments to the -W option. Multiple -W options may be
     supplied, or multiple arguments may be supplied together if they
     are separated by commas, or enclosed in quotes and separated by
     white space. Case is ignored in arguments to the -W option. Each
     -W option has a corresponding GNU style long option, as detailed
     below. Arguments to GNU style long options are either joined
     with the option by an = sign, with no intervening spaces, or
     they may be provided in the next command line argument.

     Gawk accepts the following options.

     -F fs
     --field-separator=fs
          Use fs for the input field separator (the value of the FS
          predefined variable).

     -v var=val
     --assign=var=val
          Assign the value val to the variable var before execution
          of the program begins. Such variable values are available
          to the BEGIN block of an AWK program.

Free Software Foundation          Last change: Apr 18 1994

     -f program-file
     --file=program-file
          Read the AWK program source from the file program-file,
          instead of from the first command line argument. Multiple
          -f (or --file) options may be used.

     -mf=NNN
     -mr=NNN
          Set various memory limits to the value NNN. The f flag sets
          the maximum number of fields, and the r flag sets the
          maximum record size. These two flags and the -m option are
          from the AT&T Bell Labs research version of UNIX awk. They
          are ignored by gawk, since gawk has no pre-defined limits.

     -W compat
     --compat
          Run in compatibility mode. In compatibility mode, gawk
          behaves identically to UNIX awk; none of the GNU-specific
          extensions are recognized. See GNU EXTENSIONS, below, for
          more information.

     -W copyleft
     -W copyright
     --copyleft
     --copyright
          Print the short version of the GNU copyright information
          message on the error output.

     -W help
     -W usage
     --help
     --usage
          Print a relatively short summary of the available options
          on the error output. Per the GNU Coding Standards, these
          options cause an immediate, successful exit.

     -W lint
     --lint
          Provide warnings about constructs that are dubious or
          non-portable to other AWK implementations.

     -W posix
     --posix
          This turns on compatibility mode, with the following
          additional restrictions:

          o  \x escape sequences are not recognized.

          o  The synonym func for the keyword function is not
             recognized.

          o  The operators ** and **= cannot be used in place of ^
             and ^=.

     -W source=program-text
     --source=program-text
          Use program-text as AWK program source code. This option
          allows the easy intermixing of library functions (used via
          the -f and --file options) with source code entered on the
          command line. It is intended primarily for medium to large
          size AWK programs used in shell scripts. The -W source=
          form of this option uses the rest of the command line
          argument for program-text; no other options to -W will be
          recognized in the same argument.

     -W version
     --version
          Print version information for this particular copy of gawk
          on the error output. This is useful mainly for knowing if
          the current copy of gawk on your system is up to date with
          respect to whatever the Free Software Foundation is
          distributing. Per the GNU Coding Standards, these options
          cause an immediate, successful exit.

     --   Signal the end of options. This is useful to allow further
          arguments to the AWK program itself to start with a ``-''.
          This is mainly for consistency with the argument parsing
          convention used by most other POSIX programs.

     In compatibility mode, any other options are flagged as illegal,
     but are otherwise ignored. In normal operation, as long as
     program text has been supplied, unknown options are passed on to
     the AWK program in the ARGV array for processing. This is
     particularly useful for running AWK programs via the ``#!''
     executable interpreter mechanism.

AWK PROGRAM EXECUTION
     An AWK program consists of a sequence of pattern-action
     statements and optional function definitions.

          pattern { action statements }
          function name(parameter list) { statements }

     Gawk first reads the program source from the program-file(s) if
     specified, from arguments to -W source=, or from the first
     non-option argument on the command line. The -f and -W source=
     options may be used multiple times on the command line. Gawk
     will read the program text as if all the program-files and
     command line source texts had been concatenated together. This
     is useful for building libraries of AWK functions, without
     having to include them in each new AWK program that uses them.
     It also provides the ability to mix library functions with
     command line programs.

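     The library arrangement described above might be sketched as
     follows. This is an illustration only: the file names lib.awk and
     main.awk and the function max() are invented for the example, and
     the command is spelled awk on the assumption that it is a POSIX
     awk (gawk may be substituted).

```shell
# Sketch: one file holds a reusable function, another holds the rules.
# Both are loaded with separate -f options and read as one program.
dir=$(mktemp -d)

cat > "$dir/lib.awk" <<'EOF'
# Hypothetical library function: return the larger of two numbers.
function max(a, b) { return (a > b) ? a : b }
EOF

cat > "$dir/main.awk" <<'EOF'
# Main program: remember the largest value seen in the first field.
{ best = (NR == 1) ? $1 + 0 : max(best, $1 + 0) }
END { print best }
EOF

out=$(printf '3\n17\n5\n' | awk -f "$dir/lib.awk" -f "$dir/main.awk")
printf '%s\n' "$out"

rm -rf "$dir"
```

     Because the two files are treated as one concatenated program,
     main.awk can call max() without defining it.
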
     The environment variable AWKPATH specifies a search path to use
     when finding source files named with the -f option. If this
     variable does not exist, the default path is
     ".:/local/lib/awk:/gnu/lib/awk". If a file name given to the -f
     option contains a ``/'' character, no path search is performed.

     Gawk executes AWK programs in the following order. First, all
     variable assignments specified via the -v option are performed.
     Next, gawk compiles the program into an internal form. Then,
     gawk executes the code in the BEGIN block(s) (if any), and then
     proceeds to read each file named in the ARGV array. If there are
     no files named on the command line, gawk reads the standard
     input.

     If a filename on the command line has the form var=val it is
     treated as a variable assignment. The variable var will be
     assigned the value val. (This happens after any BEGIN block(s)
     have been run.) Command line variable assignment is most useful
     for dynamically assigning values to the variables AWK uses to
     control how input is broken into fields and records. It is also
     useful for controlling state if multiple passes are needed over
     a single data file.

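     The difference in timing between -v and a var=val operand can be
     seen in a small experiment. This is a sketch only; the variable
     names early and late are invented, and awk is assumed to be a
     POSIX awk such as gawk.

```shell
# -v assignments are made before BEGIN runs; a var=val operand is only
# processed when awk reaches it in ARGV, after BEGIN has finished.
data=$(mktemp)
printf 'x\n' > "$data"

out=$(awk -v early=set '
    BEGIN { print "BEGIN: early=" early ", late=" late }
          { print "rule: early=" early ", late=" late }
' late=set "$data")

printf '%s\n' "$out"
rm -f "$data"
```

     Inside BEGIN the variable late is still empty; by the time the
     first record is processed it has been assigned.
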
     If the value of a particular element of ARGV is empty (""), gawk
     skips over it.

     For each line in the input, gawk tests to see if it matches any
     pattern in the AWK program. For each pattern that the line
     matches, the associated action is executed. The patterns are
     tested in the order they occur in the program.

     Finally, after all the input is exhausted, gawk executes the
     code in the END block(s) (if any).

VARIABLES AND FIELDS
     AWK variables are dynamic; they come into existence when they
     are first used. Their values are either floating-point numbers
     or strings, or both, depending upon how they are used. AWK also
     has one dimensional arrays; arrays with multiple dimensions may
     be simulated. Several pre-defined variables are set as a program
     runs; these will be described as needed and summarized below.

   Fields
     As each input line is read, gawk splits the line into fields,
     using the value of the FS variable as the field separator. If FS
     is a single character, fields are separated by that character.
     Otherwise, FS is expected to be a full regular expression. In
     the special case that FS is a single blank, fields are separated
     by runs of blanks and/or tabs. Note that the value of IGNORECASE
     (see below) will also affect how fields are split when FS is a
     regular expression.

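     The three kinds of FS value described above might be exercised
     like this. The input strings are invented for the example, and
     awk is assumed to be a POSIX awk such as gawk.

```shell
# Single-character FS: fields are separated by that exact character.
one=$(printf 'root:x:0\n' | awk -F: '{ print $3 }')

# Longer FS: treated as a regular expression (here, a run of digits).
two=$(printf 'a1b22c\n' | awk -F'[0-9]+' '{ print $2 "," $3 }')

# Default FS (a single blank): runs of blanks and tabs separate fields.
three=$(printf 'x \t y\n' | awk '{ print NF ":" $1 ":" $2 }')

printf '%s\n%s\n%s\n' "$one" "$two" "$three"
```
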
     If the FIELDWIDTHS variable is set to a space separated list of
     numbers, each field is expected to have fixed width, and gawk
     will split up the record using the specified widths. The value
     of FS is ignored. Assigning a new value to FS overrides the use
     of FIELDWIDTHS, and restores the default behavior.

     Each field in the input line may be referenced by its position,
     $1, $2, and so on. $0 is the whole line. The value of a field
     may be assigned to as well. Fields need not be referenced by
     constants:

          n = 5
          print $n

     prints the fifth field in the input line. The variable NF is set
     to the total number of fields in the input line.

     References to non-existent fields (i.e. fields after $NF)
     produce the null-string. However, assigning to a non-existent
     field (e.g., $(NF+2) = 5) will increase the value of NF, create
     any intervening fields with the null string as their value, and
     cause the value of $0 to be recomputed, with the fields being
     separated by the value of OFS. References to negative numbered
     fields cause a fatal error.

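     The field rules above can be tried on a three-field record. This
     is a sketch only; the input and the OFS value are chosen for the
     example, and awk is assumed to be a POSIX awk such as gawk.

```shell
out=$(printf 'a b c\n' | awk '
    BEGIN { OFS = "-" }
    {
        n = 2
        print $n                # field selected by a variable
        print "[" $(NF+1) "]"   # non-existent field: the null string
        $(NF+2) = 5             # NF grows from 3 to 5, $0 is rebuilt
        print NF
        print $0                # fields are now joined with OFS
    }')
printf '%s\n' "$out"
```

     The rebuilt record contains an empty fourth field, which shows up
     as two adjacent OFS characters.
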
   Built-in Variables
     AWK's built-in variables are:

     ARGC        The number of command line arguments (does not
                 include options to gawk, or the program source).

     ARGIND      The index in ARGV of the current file being
                 processed.

     ARGV        Array of command line arguments. The array is
                 indexed from 0 to ARGC - 1. Dynamically changing the
                 contents of ARGV can control the files used for
                 data.

     CONVFMT     The conversion format for numbers, "%.6g", by
                 default.

     ENVIRON     An array containing the values of the current
                 environment. The array is indexed by the environment
                 variables, each element being the value of that
                 variable (e.g., ENVIRON["HOME"] might be /u/arnold).
                 Changing this array does not affect the environment
                 seen by programs which gawk spawns via redirection
                 or the system() function. (This may change in a
                 future version of gawk.)

     ERRNO       If a system error occurs either doing a redirection
                 for getline, during a read for getline, or during a
                 close(), then ERRNO will contain a string describing
                 the error.

     FIELDWIDTHS A white-space separated list of field widths. When
                 set, gawk parses the input into fields of fixed
                 width, instead of using the value of the FS variable
                 as the field separator. The fixed field width
                 facility is still experimental; expect the semantics
                 to change as gawk evolves over time.

     FILENAME    The name of the current input file. If no files are
                 specified on the command line, the value of FILENAME
                 is ``-''. However, FILENAME is undefined inside the
                 BEGIN block.

     FNR         The input record number in the current input file.

     FS          The input field separator, a blank by default.

     IGNORECASE  Controls the case-sensitivity of all regular
                 expression operations. If IGNORECASE has a non-zero
                 value, then pattern matching in rules, field
                 splitting with FS, regular expression matching with
                 ~ and !~, and the gsub(), index(), match(), split(),
                 and sub() pre-defined functions will all ignore case
                 when doing regular expression operations. Thus, if
                 IGNORECASE is not equal to zero, /aB/ matches all of
                 the strings "ab", "aB", "Ab", and "AB". As with all
                 AWK variables, the initial value of IGNORECASE is
                 zero, so all regular expression operations are
                 normally case-sensitive.

     NF          The number of fields in the current input record.

     NR          The total number of input records seen so far.

     OFMT        The output format for numbers, "%.6g", by default.

     OFS         The output field separator, a blank by default.

     ORS         The output record separator, by default a newline.

     RS          The input record separator, by default a newline. RS
                 is exceptional in that only the first character of
                 its string value is used for separating records.
                 (This will probably change in a future release of
                 gawk.) If RS is set to the null string, then records
                 are separated by blank lines. When RS is set to the
                 null string, then the newline character always acts
                 as a field separator, in addition to whatever value
                 FS may have.

     RSTART      The index of the first character matched by match();
                 0 if no match.

     RLENGTH     The length of the string matched by match(); -1 if
                 no match.

     SUBSEP      The character used to separate multiple subscripts
                 in array elements, by default "\034".

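     Two of the entries above, RS set to the null string and the
     RSTART/RLENGTH pair, might be checked as follows. The sample
     strings are invented, and awk is assumed to be a POSIX awk such
     as gawk.

```shell
# RS = "": blank lines separate records, and a newline inside a record
# also separates fields.
para=$(printf 'a b\nc\n\nd e\n' |
       awk 'BEGIN { RS = "" } { print NR ": " NF }')

# match() sets RSTART and RLENGTH; they are 0 and -1 when nothing matches.
hit=$(awk 'BEGIN { match("foobar", /o+/); print RSTART, RLENGTH }')
miss=$(awk 'BEGIN { match("foobar", /z/); print RSTART, RLENGTH }')

printf '%s\n%s\n%s\n' "$para" "$hit" "$miss"
```
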
   Arrays
     Arrays are subscripted with an expression between square
     brackets ([ and ]). If the expression is an expression list
     (expr, expr ...) then the array subscript is a string consisting
     of the concatenation of the (string) value of each expression,
     separated by the value of the SUBSEP variable. This facility is
     used to simulate multiply dimensioned arrays. For example:

          i = "A" ; j = "B" ; k = "C"
          x[i, j, k] = "hello, world\n"

     assigns the string "hello, world\n" to the element of the array
     x which is indexed by the string "A\034B\034C". All arrays in
     AWK are associative, i.e. indexed by string values.

     The special operator in may be used in an if or while statement
     to see if an array has an index consisting of a particular
     value.

          if (val in array)
               print array[val]

     If the array has multiple subscripts, use (i, j) in array.

     The in construct may also be used in a for loop to iterate over
     all the elements of an array.

     An element may be deleted from an array using the delete
     statement. The delete statement may also be used to delete the
     entire contents of an array.

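     The array facilities described in this section might be combined
     as in the sketch below. The array name and subscripts are
     invented for the example, only single-element delete is used so
     that the code stays portable, and awk is assumed to be a POSIX
     awk such as gawk.

```shell
out=$(awk 'BEGIN {
    x["A", "B"] = "hello"

    # Membership test with multiple subscripts.
    if (("A", "B") in x) print "found"

    # The stored index is the subscripts joined by SUBSEP.
    for (key in x) {
        n = split(key, part, SUBSEP)
        print n, part[1], part[2]
    }

    # Delete the element, then count what is left.
    delete x["A", "B"]
    left = 0
    for (key in x) left++
    print left
}')
printf '%s\n' "$out"
```
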
   Variable Typing And Conversion
     Variables and fields may be (floating point) numbers, or
     strings, or both. How the value of a variable is interpreted
     depends upon its context. If used in a numeric expression, it
     will be treated as a number; if used as a string, it will be
     treated as a string.

     To force a variable to be treated as a number, add 0 to it; to
     force it to be treated as a string, concatenate it with the null
     string.

     When a string must be converted to a number, the conversion is
     accomplished using atof(3). A number is converted to a string by
     using the value of CONVFMT as a format string for sprintf(3),
     with the numeric value of the variable as the argument. However,
     even though all numbers in AWK are floating-point, integral
     values are always converted as integers. Thus, given

          CONVFMT = "%2.2f"
          a = 12
          b = a ""

     the variable b has a string value of "12" and not "12.00".

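     The conversion rules above can be observed directly. This is a
     sketch only; the variable names and the string "3x" are chosen
     for the example, and awk is assumed to be a POSIX awk such as
     gawk.

```shell
out=$(awk 'BEGIN {
    CONVFMT = "%2.2f"
    a = 12;   b = a ""    # integral value: converted as an integer
    c = 12.5; d = c ""    # non-integral value: formatted with CONVFMT
    print b
    print d
    print ("3x" + 0)      # string to number: leading numeric part only
}')
printf '%s\n' "$out"
```
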
     Gawk performs comparisons as follows: If two variables are
     numeric, they are compared numerically. If one value is numeric
     and the other has a string value that is a ``numeric string,''
     then comparisons are also done numerically. Otherwise, the
     numeric value is converted to a string and a string comparison
     is performed. Two strings are compared, of course, as strings.
     According to the POSIX standard, even if two strings are numeric
     strings, a numeric comparison is performed. However, this is
     clearly incorrect, and gawk does not do this.

     Uninitialized variables have the numeric value 0 and the string
     value "" (the null, or empty, string).

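     A few of the comparison cases above might be probed as follows.
     This is a sketch only: the labels printed and the variable name
     nothing are invented, and awk is assumed to be a POSIX awk such
     as gawk. The case of two numeric strings is deliberately left
     out, since the text notes that implementations differ on it.

```shell
out=$(printf '10\n' | awk '{
    # A field holding a numeric string against a number: numeric.
    a = ($1 > 9) ? "numeric" : "string"

    # Two string constants: compared as strings, so "10" sorts first.
    b = ("10" > "9") ? "numeric" : "string"

    # An uninitialized variable equals both 0 and the null string.
    c = (nothing == 0 && nothing == "") ? "both" : "neither"

    print a; print b; print c
}')
printf '%s\n' "$out"
```
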
PATTERNS AND ACTIONS
     AWK is a line-oriented language. The pattern comes first, and
     then the action. Action statements are enclosed in { and }.
     Either the pattern may be missing, or the action may be missing,
     but, of course, not both. If the pattern is missing, the action
     will be executed for every single line of input. A missing
     action is equivalent to

          { print }

     which prints the entire line.

     Comments begin with the ``#'' character, and continue until the
     end of the line. Blank lines may be used to separate statements.
     Normally, a statement ends with a newline; however, this is not
     the case for lines ending in a ``,'', ``{'', ``?'', ``:'',
     ``&&'', or ``||''. Lines ending in do or else also have their
     statements automatically continued on the following line. In
     other cases, a line can be continued by ending it with a ``\'',
     in which case the newline will be ignored.

     Multiple statements may be put on one line by separating them
     with a ``;''. This applies to both the statements within the
     action part of a pattern-action pair (the usual case), and to
     the pattern-action statements themselves.

   Patterns
     AWK patterns may be one of the following:

          BEGIN
          END
          /regular expression/
          relational expression
          pattern && pattern
          pattern || pattern
          pattern ? pattern : pattern
          (pattern)
          ! pattern
          pattern1, pattern2

     BEGIN and END are two special kinds of patterns which are not
     tested against the input. The action parts of all BEGIN patterns
     are merged as if all the statements had been written in a single
     BEGIN block. They are executed before any of the input is read.
     Similarly, all the END blocks are merged, and executed when all
     the input is exhausted (or when an exit statement is executed).
     BEGIN and END patterns cannot be combined with other patterns in
     pattern expressions. BEGIN and END patterns cannot have missing
     action parts.

     For /regular expression/ patterns, the associated statement is
     executed for each input line that matches the regular
     expression. Regular expressions are the same as those in
     egrep(1), and are summarized below.

     A relational expression may use any of the operators defined
     below in the section on actions. These generally test whether
     certain fields match certain regular expressions.

     The &&, ||, and ! operators are logical AND, logical OR, and
     logical NOT, respectively, as in C. They do short-circuit
     evaluation, also as in C, and are used for combining more
     primitive pattern expressions. As in most languages, parentheses
     may be used to change the order of evaluation.

     The ?: operator is like the same operator in C. If the first
     pattern is true then the pattern used for testing is the second
     pattern, otherwise it is the third. Only one of the second and
     third patterns is evaluated.

     The pattern1, pattern2 form of an expression is called a range
     pattern. It matches all input records starting with a record
     that matches pattern1, and continuing until a record that
     matches pattern2, inclusive. It does not combine with any other
     sort of pattern expression.

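     Several of the pattern forms above might appear together as in
     this sketch. The input (the numbers 1 through 5) and the printed
     labels are invented, and awk is assumed to be a POSIX awk such as
     gawk.

```shell
out=$(printf '1\n2\n3\n4\n5\n' | awk '
    BEGIN          { print "start" }
    /2/, /4/       { print "range " $0 }   # from a 2 through a 4, inclusive
    $1 > 4 && !/9/ { print "big " $0 }     # relational test combined with !
    END            { print "records " NR }
')
printf '%s\n' "$out"
```
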
   Regular Expressions
     Regular expressions are the extended kind found in egrep. They
     are composed of characters as follows:

     c           matches the non-metacharacter c.

     \c          matches the literal character c.

     .           matches any character except newline.

     ^           matches the beginning of a line or a string.

     $           matches the end of a line or a string.

     [abc...]    character class, matches any of the characters
                 abc....

     [^abc...]   negated character class, matches any character
                 except abc... and newline.

     r1|r2       alternation: matches either r1 or r2.

     r1r2        concatenation: matches r1, and then r2.

     r+          matches one or more r's.

     r*          matches zero or more r's.

     r?          matches zero or one r's.

     (r)         grouping: matches r.

     The escape sequences that are valid in string constants (see
     below) are also legal in regular expressions.

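     A handful of the regular expression operators listed above might
     be tried against a small word list. The words are invented for
     the example, and awk is assumed to be a POSIX awk such as gawk. A
     pattern with no action prints each matching line.

```shell
words='color
colour
colr
cat
c.t'

opt=$(printf '%s\n' "$words" | awk '/^colou?r$/')        # r? : zero or one
cls=$(printf '%s\n' "$words" | awk '/^c[aeiou]+t$/')     # class with r+
lit=$(printf '%s\n' "$words" | awk '/^c\.t$/')           # \c : literal dot
alt=$(printf '%s\n' "$words" | awk '/^(cat|colr)$/')     # alternation

printf '%s\n--\n%s\n--\n%s\n--\n%s\n' "$opt" "$cls" "$lit" "$alt"
```
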
   Actions
     Action statements are enclosed in braces, { and }. Action
     statements consist of the usual assignment, conditional, and
     looping statements found in most languages. The operators,
     control statements, and input/output statements available are
     patterned after those in C.

   Operators
     The operators in AWK, in order of increasing precedence, are

     = += -= *= /= %= ^=
                 Assignment. Both absolute assignment (var = value)
                 and operator-assignment (the other forms) are
                 supported.

     ?:          The C conditional expression. This has the form
                 expr1 ? expr2 : expr3. If expr1 is true, the value
                 of the expression is expr2, otherwise it is expr3.
                 Only one of expr2 and expr3 is evaluated.

     ||          Logical OR.

     &&          Logical AND.

     ~ !~        Regular expression match, negated match. NOTE: Do
                 not use a constant regular expression (/foo/) on the
                 left-hand side of a ~ or !~. Only use one on the
                 right-hand side. The expression /foo/ ~ exp has the
                 same meaning as (($0 ~ /foo/) ~ exp). This is
                 usually not what was intended.

     < > <= >= != ==
                 The regular relational operators.

     blank       String concatenation.

     + -         Addition and subtraction.

     * / %       Multiplication, division, and modulus.

     + - !       Unary plus, unary minus, and logical negation.

     ^           Exponentiation (** may also be used, and **= for the
                 assignment operator).

     ++ --       Increment and decrement, both prefix and postfix.

     $           Field reference.

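     The effect of the precedence order above might be checked with a
     few expressions. This is a sketch only; the input record and
     variable names are invented, and awk is assumed to be a POSIX awk
     such as gawk.

```shell
out=$(printf 'foo bar\n' | awk '{
    print 2 + 3 * 4 ^ 2      # ^ binds tighter than *, and * than +
    print 1 + 2 " " 3 * 4    # arithmetic is done before concatenation
    n = 5; n += 2; n ^= 2    # operator-assignment forms
    print n
    m = ($2 ~ /^ba/) ? "match" : "no match"
    print m
    i = 1
    print $i, $(i + 1)       # field reference through an expression
}')
printf '%s\n' "$out"
```
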
   Control Statements
     The control statements are as follows:

          if (condition) statement [ else statement ]
          while (condition) statement
          do statement while (condition)
          for (expr1; expr2; expr3) statement
          for (var in array) statement
          break
          continue
          delete array[index]
          delete array
          exit [ expression ]
          { statements }

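     Most of the control statements above might be seen in one short
     program. This is a sketch only; the loop bounds and the exit
     value 3 are chosen for the example, and awk is assumed to be a
     POSIX awk such as gawk.

```shell
status=0
out=$(awk 'BEGIN {
    # for, continue, and break: add up the odd numbers below 8.
    for (i = 1; i <= 10; i++) {
        if (i % 2 == 0) continue
        if (i > 8) break
        sum += i
    }
    print sum

    # do ... while runs its body before testing the condition.
    n = 0
    do { n++ } while (n < 3)
    print n

    # exit with an expression sets the exit status of awk.
    exit 3
}') || status=$?

printf '%s\nexit status %s\n' "$out" "$status"
```
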
   I/O Statements
     The input/output statements are as follows:

     close(filename)        Close file (or pipe, see below).

     getline                Set $0 from next input record; set NF,
                            NR, FNR.

     getline <file          Set $0 from next record of file; set NF.

     getline var            Set var from next input record; set NR,
                            FNR.

     getline var <file      Set var from next record of file.

     next                   Stop processing the current input record.
                            The next input record is read and
                            processing starts over with the first
                            pattern in the AWK program. If the end of
                            the input data is reached, the END
                            block(s), if any, are executed.

     next file              Stop processing the current input file.
                            The next input record read comes from the
                            next input file. FILENAME is updated, FNR
                            is reset to 1, and processing starts over
                            with the first pattern in the AWK
                            program. If the end of the input data is
                            reached, the END block(s), if any, are
                            executed.

     print                  Prints the current record.

     print expr-list        Prints expressions. Each expression is
                            separated by the value of the OFS
                            variable. The output record is terminated
                            with the value of the ORS variable.

     print expr-list >file  Prints expressions on file. Each
                            expression is separated by the value of
                            the OFS variable. The output record is
                            terminated with the value of the ORS
                            variable.

     printf fmt, expr-list  Format and print.

     printf fmt, expr-list >file
                            Format and print on file.

     system(cmd-line)       Execute the command cmd-line, and return
                            the exit status. (This may not be
                            available on non-POSIX systems.)

     Other input/output redirections are also allowed. For print and
     printf, >>file appends output to the file, while | command
     writes on a pipe. In a similar fashion, command | getline pipes
     into getline. The getline command will return 0 on end of file,
     and -1 on an error.

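     The redirections and getline return values above might be
     exercised as follows. This is a sketch only: the temporary file,
     the echo command, and the path /nonexistent/hypothetical/path are
     assumptions made for the example, and awk is assumed to be a
     POSIX awk such as gawk.

```shell
tmp=$(mktemp)

out=$(awk -v file="$tmp" 'BEGIN {
    print "one" > file          # write two records to the file
    print "two" > file
    close(file)                 # close it before reopening
    print "three" >> file       # >> appends to the existing contents
    close(file)

    # getline returns a positive value until end of file is reached.
    while ((getline line < file) > 0)
        count++
    close(file)
    print count

    # command | getline reads one record of the command output.
    "echo piped" | getline word
    close("echo piped")
    print word

    # Reading from a file that cannot be opened is an error.
    print (getline line < "/nonexistent/hypothetical/path")
}')

printf '%s\n' "$out"
rm -f "$tmp"
```

     Testing the getline result with > 0 rather than treating it as a
     plain truth value keeps the loop from spinning forever when the
     return value is -1.
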
   The printf Statement
     The AWK versions of the printf statement and sprintf() function
     (see below) accept the following conversion specification
     formats:

     %c   An ASCII character. If the argument used for %c is numeric,
          it is treated as a character and printed. Otherwise, the
          argument is assumed to be a string, and only the first
          character of that string is printed.

     %d   A decimal number (the integer part).

     %i   Just like %d.

     %e   A floating point number of the form [-]d.ddddddE[+-]dd.

     %f   A floating point number of the form [-]ddd.dddddd.

     %g   Use e or f conversion, whichever is shorter, with
          nonsignificant zeros suppressed.

     %o   An unsigned octal number (again, an integer).

- %s A character string.
-
- %x An unsigned hexadecimal number (an integer).
-
- %X Like %x, but using ABCDEF instead of abcdef.
-
- %% A single % character; no argument is converted.
-
- There are optional, additional parameters that may lie
- between the % and the control letter:
-
- - The expression should be left-justified within its
- field.
-
- width
- The field should be padded to this width. If the width
- has a leading zero, then the field will be padded with
- zeros. Otherwise it is padded with blanks. This
- applies even to the non-numeric output formats.
-
- .prec
- A number indicating the maximum width of strings, or the
- number of digits to the right of the decimal point.
-
- The dynamic width and prec capabilities of the ANSI C
- printf() routines are supported. A * in place of either the
- width or prec specifications will cause their values to be
- taken from the argument list to printf or sprintf().
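-
- A short sketch of these conversions and modifiers, run from the
- shell. The sample values are arbitrary, and the dynamic * forms are
- assumed to be supported by whichever awk is installed (they are in
- gawk, as described above).
-
```shell
printf_out=$(awk 'BEGIN {
    printf "[%c][%c]\n", 65, "hello"            # number -> character, string -> first character
    printf "[%d][%i]\n", 3.99, -3.99            # only the integer part is printed
    printf "[%5d][%-5d][%05d]\n", 42, 42, 42    # width, left-justify, zero padding
    printf "[%.3s][%.2f]\n", "abcdef", 3.14159  # precision for strings and for numbers
    printf "[%*d][%.*f]\n", 6, 42, 1, 3.14159   # width and precision taken from the arguments
    printf "[%o][%x][%X][%%]\n", 8, 255, 255
}')
echo "$printf_out"
```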
-
- Special File Names
- When doing I/O redirection from either print or printf into
- a file, or via getline from a file, gawk recognizes certain
- special filenames internally. These filenames allow access
- to open file descriptors inherited from gawk's parent process
- (usually the shell). Other special filenames provide
- access to information about the running gawk process. The
- filenames are:
-
- /dev/pid Reading this file returns the process ID of the
- current process, in decimal, terminated with a
- newline.
-
- /dev/ppid Reading this file returns the parent process ID
- of the current process, in decimal, terminated
- with a newline.
-
- /dev/pgrpid Reading this file returns the process group ID
- of the current process, in decimal, terminated
- with a newline.
-
- /dev/user Reading this file returns a single record
- terminated with a newline. The fields are
- separated with blanks. $1 is the value of the
- getuid(2) system call, $2 is the value of the
- geteuid(2) system call, $3 is the value of the
- getgid(2) system call, and $4 is the value of
- the getegid(2) system call. If there are any
- additional fields, they are the group IDs
- returned by getgroups(2). Multiple groups may
- not be supported on all systems.
-
- /dev/stdin The standard input.
-
- /dev/stdout The standard output.
-
- /dev/stderr The standard error output.
-
- /dev/fd/n The file associated with the open file
- descriptor n.
-
- These are particularly useful for error messages. For example:
-
- print "You blew it!" > "/dev/stderr"
-
- whereas you would otherwise have to use
-
- print "You blew it!" | "cat 1>&2"
-
- These file names may also be used on the command line to
- name data files.
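-
- The following sketch separates data from diagnostics by writing
- the latter to /dev/stderr. The shell function name report and the
- sample input are made up for the illustration; on systems where
- awk does not interpret the name itself, the example assumes that
- the operating system provides /dev/stderr.
-
```shell
# Numeric lines are doubled on standard output; anything else is
# reported on standard error.
report() {
    awk '{
        if ($1 ~ /^[0-9]+$/)
            print $1 * 2
        else
            print "bad input: " $0 > "/dev/stderr"
    }'
}

stdout_part=$(printf '21\noops\n' | report 2>/dev/null)
stderr_part=$(printf '21\noops\n' | report 2>&1 >/dev/null)
echo "stdout: $stdout_part"
echo "stderr: $stderr_part"
```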
-
- Numeric Functions
- AWK has the following pre-defined arithmetic functions:
-
- atan2(y, x) returns the arctangent of y/x in radians.
-
- cos(expr) returns the cosine of expr, which is in radians.
-
- exp(expr) the exponential function.
-
- int(expr) truncates to integer.
-
- log(expr) the natural logarithm function.
-
- rand() returns a random number between 0 and 1.
-
- sin(expr) returns the sine of expr, which is in radians.
-
- sqrt(expr) the square root function.
-
- srand(expr) use expr as a new seed for the random number
- generator. If no expr is provided, the time of
- day will be used. The return value is the
- previous seed for the random number generator.
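-
- A brief sketch of the arithmetic functions, run from the shell.
- The seed value 10 and the variable names are arbitrary choices for
- the illustration; the last line relies on srand() returning the
- previous seed, as described above.
-
```shell
numeric_out=$(awk 'BEGIN {
    print int(3.9), int(-3.9)          # truncation is toward zero
    print sqrt(16), exp(0), log(1)
    print atan2(0, -1)                 # pi, printed using the default OFMT
    print sin(0), cos(0)               # arguments are in radians

    srand(10)                          # seed the generator explicitly
    first = rand()
    prev = srand(10)                   # returns the previous seed, here 10
    again = rand()                     # same seed, so the same first value
    same = (first == again)
    in_range = (first >= 0 && first < 1)
    print prev, same, in_range
}')
echo "$numeric_out"
```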
-
- String Functions
- AWK has the following pre-defined string functions:
-
- gsub(r, s, t) for each substring matching the regular
- expression r in the string t,
- substitute the string s, and return
- the number of substitutions. If t
- is not supplied, use $0.
-
- index(s, t) returns the index of the string t in
- the string s, or 0 if t is not
- present.
-
- length(s) returns the length of the string s,
- or the length of $0 if s is not supplied.
-
- match(s, r) returns the position in s where the
- regular expression r occurs, or 0 if
- r is not present, and sets the
- values of RSTART and RLENGTH.
-
- split(s, a, r) splits the string s into the array a
- on the regular expression r, and
- returns the number of fields. If r
- is omitted, FS is used instead. The
- array a is cleared first.
-
- sprintf(fmt, expr-list) formats expr-list according to fmt,
- and returns the resulting string.
-
- sub(r, s, t) just like gsub(), but only the first
- matching substring is replaced.
-
- substr(s, i, n) returns the n-character substring of
- s starting at i. If n is omitted,
- the rest of s is used.
-
- tolower(str) returns a copy of the string str,
- with all the upper-case characters
- in str translated to their
- corresponding lower-case counterparts.
- Non-alphabetic characters
- are left unchanged.
-
- toupper(str) returns a copy of the string str,
- with all the lower-case characters
- in str translated to their
- corresponding upper-case counterparts.
- Non-alphabetic characters
- are left unchanged.
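-
- The string functions can be exercised with a sketch like the one
- below. The sample strings and variable names are invented for the
- illustration. Note that gsub() and sub() change their third
- argument in place and return only the count.
-
```shell
string_out=$(awk 'BEGIN {
    s = "The quick brown fox"
    print length(s), index(s, "quick"), substr(s, 5, 5), substr(s, 17)

    pos = match(s, /b[a-z]+/)          # also sets RSTART and RLENGTH
    print pos, RSTART, RLENGTH

    n = split("a:b:c", parts, ":")
    print n, parts[1], parts[3]

    t = "banana"
    count = gsub(/an/, "AN", t)        # every match is replaced
    print count, t
    sub(/AN/, "-", t)                  # only the first match is replaced
    print t

    print toupper("mixed Case 1"), tolower("mixed Case 1")
    print sprintf("%03d|%-4s|", 7, "ab")
}')
echo "$string_out"
```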
-
- Time Functions
- Since one of the primary uses of AWK programs is processing
- log files that contain time stamp information, gawk provides
- the following two functions for obtaining time stamps and
- formatting them.
-
- systime() returns the current time of day as the number of
- seconds since the Epoch (Midnight UTC, January 1,
- 1970 on POSIX systems).
-
- strftime(format, timestamp)
- formats timestamp according to the specification
- in format. The timestamp should be of the same
- form as returned by systime(). If timestamp is
- missing, the current time of day is used. See the
- specification for the strftime() function in ANSI
- C for the format conversions that are guaranteed
- to be available. A public-domain version of
- strftime(3) and a man page for it are shipped with
- gawk; if that version was used to build gawk, then
- all of the conversions described in that man page
- are available to gawk.
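-
- A minimal sketch of the two time functions. It assumes a POSIX
- system (so that time stamp 0 is the Epoch) and an awk that
- provides these functions; the shell variable AWK and the choice of
- TZ=UTC are assumptions made so that the result does not depend on
- the local time zone.
-
```shell
# Prefer gawk when it is installed; systime() and strftime() are
# not part of POSIX awk.
AWK=$(command -v gawk || command -v awk)

# Format time stamp 0 (the Epoch) in UTC.
epoch_str=$(TZ=UTC "$AWK" 'BEGIN { print strftime("%Y-%m-%d %H:%M:%S", 0) }')

# systime() returns seconds since the Epoch, so it is positive today.
stamp_ok=$("$AWK" 'BEGIN { now = systime(); ok = (now > 0); print ok }')

echo "$epoch_str"
echo "$stamp_ok"
```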
-
- String Constants
- String constants in AWK are sequences of characters enclosed
- between double quotes ("). Within strings, certain escape
- sequences are recognized, as in C. These are:
-
- \\ A literal backslash.
-
- \a The ``alert'' character; usually the ASCII BEL character.
-
- \b backspace.
-
- \f form-feed.
-
- \n new line.
-
- \r carriage return.
-
- \t horizontal tab.
-
- \v vertical tab.
-
- \xhex digits
- The character represented by the string of hexadecimal
- digits following the \x. As in ANSI C, all following
- hexadecimal digits are considered part of the escape
- sequence. (This feature should tell us something about
- language design by committee.) E.g., "\x1B" is the
- ASCII ESC (escape) character.
-
- \ddd The character represented by the 1-, 2-, or 3-digit
- sequence of octal digits. E.g., "\033" is the ASCII ESC
- (escape) character.
-
- \c The literal character c.
-
- The escape sequences may also be used inside constant regular
- expressions (e.g., /[ \t\f\n\r\v]/ matches whitespace
- characters).
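-
- A small sketch of escape sequences in string constants and in a
- constant regular expression. The sample strings are arbitrary, and
- only the portable octal form is used here because \x is a gawk
- extension.
-
```shell
escape_out=$(awk 'BEGIN {
    printf "tab:[%s] backslash:[%s]\n", "a\tb", "a\\b"

    oct = ("\101" == "A")                    # octal 101 is the letter A
    esc = ("\033" == sprintf("%c", 27))      # octal 033 is ASCII ESC
    print oct, esc

    line = "one \t two\tthree"
    n = split(line, words, /[ \t]+/)         # escapes inside a regular expression
    print n, words[2]
}')
printf '%s\n' "$escape_out"
```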
-
- FUNCTIONS
- Functions in AWK are defined as follows:
-
- function name(parameter list) { statements }
-
- Functions are executed when called from within the action
- parts of regular pattern-action statements. Actual parameters
- supplied in the function call are used to instantiate
- the formal parameters declared in the function. Arrays are
- passed by reference; other variables are passed by value.
-
- Since functions were not originally part of the AWK
- language, the provision for local variables is rather
- clumsy: they are declared as extra parameters in the parameter
- list. The convention is to separate local variables from
- real parameters by extra spaces in the parameter list. For
- example:
-
- function f(p, q,     a, b) {   # a & b are local
- ..... }
-
- /abc/ { ... ; f(1, 2) ; ... }
-
- The left parenthesis in a function call is required to
- immediately follow the function name, without any intervening
- white space. This is to avoid a syntactic ambiguity
- with the concatenation operator. This restriction does not
- apply to the built-in functions listed above.
-
- Functions may call each other and may be recursive. Function
- parameters used as local variables are initialized to
- the null string and the number zero upon function invocation.
-
- The word func may be used in place of function.
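-
- The rules above (locals as extra parameters, arrays by reference,
- scalars by value, recursion) can be sketched as follows. The
- function names fact and fill and all variable names are
- hypothetical, chosen only for this illustration.
-
```shell
func_out=$(awk '
# n is a real parameter; result is a local, set apart by extra spaces.
function fact(n,    result) {
    result = (n <= 1) ? 1 : n * fact(n - 1)   # functions may be recursive
    return result
}

# Arrays are passed by reference, so the caller sees the new element.
# Scalars are passed by value, so the change to count is lost.
function fill(arr, count) {
    arr["added"] = 1
    count = 99
}

BEGIN {
    a["start"] = 1
    c = 1
    fill(a, c)
    present = ("added" in a)
    print fact(5), present, c
}')
echo "$func_out"
```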
-
- EXAMPLES
- Print and sort the login names of all users:
-
- BEGIN { FS = ":" }
- { print $1 | "sort" }
-
- Count lines in a file:
-
- { nlines++ }
- END { print nlines }
-
- Precede each line by its number in the file:
-
- { print FNR, $0 }
-
- Concatenate and line number (a variation on a theme):
-
- { print NR, $0 }
-
- SEE ALSO
- egrep(1), getpid(2), getppid(2), getpgrp(2), getuid(2),
- geteuid(2), getgid(2), getegid(2), getgroups(2)
-
- The AWK Programming Language, Alfred V. Aho, Brian W. Kernighan,
- Peter J. Weinberger, Addison-Wesley, 1988. ISBN
- 0-201-07981-X.
-
- The GAWK Manual, Edition 0.15, published by the Free
- Software Foundation, 1993.
-
- POSIX COMPATIBILITY
- A primary goal for gawk is compatibility with the POSIX
- standard, as well as with the latest version of UNIX awk.
- To this end, gawk incorporates the following user visible
- features which are not described in the AWK book, but are
- part of awk in System V Release 4, and are in the POSIX
- standard.
-
- The -v option for assigning variables before program execution
- starts is new. The book indicates that command line
- variable assignment happens when awk would otherwise open
- the argument as a file, which is after the BEGIN block is
- executed. However, in earlier implementations, when such an
- assignment appeared before any file names, the assignment
- would happen before the BEGIN block was run. Applications
- came to depend on this ``feature.'' When awk was changed to
- match its documentation, this option was added to accommodate
- applications that depended upon the old behavior.
- (This feature was agreed upon by both the AT&T and GNU
- developers.)
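-
- The timing difference described above can be observed with a
- short shell sketch. The variable name greeting is invented for the
- illustration, and /dev/null merely stands in for a data file.
-
```shell
# -v assigns the variable before the BEGIN block runs.
with_v=$(awk -v greeting=hello 'BEGIN { print "[" greeting "]" }')

# An assignment given as an operand is processed when awk reaches it
# in the argument list, which is after BEGIN has already run.
as_operand=$(awk 'BEGIN { print "[" greeting "]" }
                  END   { print "[" greeting "]" }' greeting=hello /dev/null)

echo "$with_v"
echo "$as_operand"
```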
-
- The -W option for implementation specific features is from
- the POSIX standard.
-
- When processing arguments, gawk uses the special option ``--''
- to signal the end of arguments. In compatibility mode,
- it will warn about, but otherwise ignore, undefined options.
- In normal operation, such arguments are passed on to the AWK
- program for it to process.
-
- The AWK book does not define the return value of srand().
- The System V Release 4 version of UNIX awk (and the POSIX
- standard) has it return the seed it was using, to allow
- keeping track of random number sequences. Therefore srand()
- in gawk also returns its current seed.
-
- Other new features are: The use of multiple -f options
- (from MKS awk); the ENVIRON array; the \a and \v escape
- sequences (done originally in gawk and fed back into
- AT&T's); the tolower() and toupper() built-in functions
- (from AT&T); and the ANSI C conversion specifications in
- printf (done first in AT&T's version).
-
- GNU EXTENSIONS
- Gawk has some extensions to POSIX awk. They are described
- in this section. All the extensions described here can be
- disabled by invoking gawk with the -W compat option.
-
- The following features of gawk are not available in POSIX
- awk.
-
- o The \x escape sequence.
-
- o The systime() and strftime() functions.
-
- o The special file names available for I/O redirection.
-
- o The ARGIND and ERRNO special variables.
-
- o The IGNORECASE variable and its side-effects.
-
- o The FIELDWIDTHS variable and fixed width field
- splitting.
-
- o The path search for files named via the -f option,
- controlled by the AWKPATH environment variable.
-
- o The use of next file to abandon processing of the
- current input file.
-
- o The use of delete array to delete the entire contents
- of an array.
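-
- As one illustration from this list, the sketch below compares
- delete array with the element-by-element form. The array and
- variable names are made up, and the awk in use is assumed to
- accept the delete array extension (gawk does, as described above).
-
```shell
# Prefer gawk when it is installed, since delete array is an extension.
AWK=$(command -v gawk || command -v awk)

delete_out=$("$AWK" 'BEGIN {
    n = split("a b c", arr)
    delete arr                           # extension: remove every element at once
    left = 0
    for (k in arr) left++

    m = split("a b c", other)
    for (k in other) delete other[k]     # element-by-element form
    rest = 0
    for (k in other) rest++

    print n, left, m, rest
}')
echo "$delete_out"
```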
-
- The AWK book does not define the return value of the close()
- function. Gawk's close() returns the value from fclose(3),
- or pclose(3), when closing a file or pipe, respectively.
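-
- A minimal sketch of checking the value returned by close() for an
- ordinary file. The scratch file and variable names are assumptions
- for the example; a return value of 0 corresponds to a successful
- fclose(3).
-
```shell
# Hypothetical scratch file; mktemp picks the actual name.
tmp=$(mktemp)

close_out=$(awk -v f="$tmp" 'BEGIN {
    print "data" > f
    rc = close(f)          # 0 when the file was closed successfully
    getline line < f       # the file can now be reopened for reading
    close(f)
    print rc, line
}')
rm -f "$tmp"
echo "$close_out"
```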
-
- When gawk is invoked with the -W compat option, if the fs
- argument to the -F option is ``t'', then FS will be set to
- the tab character. Since this is a rather ugly special
- case, it is not the default behavior. This behavior also
- does not occur if -W posix has been specified.
-
- HISTORICAL FEATURES
- There are two features of historical AWK implementations
- that gawk supports. First, it is possible to call the
- length() built-in function not only with no argument, but
- even without parentheses! Thus,
-
- a = length
-
- is the same as either of
-
- a = length()
- a = length($0)
-
- This feature is marked as ``deprecated'' in the POSIX standard,
- and gawk will issue a warning about its use if -W lint
- is specified on the command line.
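-
- The equivalence of the three forms can be checked with a short
- sketch; the two input lines are arbitrary sample data.
-
```shell
length_out=$(printf 'hello\nhi there\n' |
    awk '{ a = length; b = length(); c = length($0); print a, b, c }')
echo "$length_out"
```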
-
- The other feature is the use of the continue statement outside
- the body of a while, for, or do loop. Traditional AWK
- implementations have treated such usage as equivalent to the
- next statement. Gawk will support this usage if -W posix
- has not been specified.
-
- ENVIRONMENT VARIABLES
- If POSIXLY_CORRECT exists in the environment, then gawk
- behaves exactly as if --posix had been specified on the command
- line. If --lint has been specified, gawk will issue a
- warning message to this effect.
-
- BUGS
- The -F option is not necessary given the command line variable
- assignment feature; it remains only for backwards compatibility.
-
- If your system actually has support for /dev/fd and the
- associated /dev/stdin, /dev/stdout, and /dev/stderr files,
- you may get different output from gawk than you would get on
- a system without those files. When gawk interprets these
- files internally, it synchronizes output to the standard
- output with output to /dev/stdout, while on a system with
- those files, the output is actually to different open files.
- Caveat Emptor.
-
- VERSION INFORMATION
- This man page documents gawk, version 2.15.
-
- Starting with the 2.15 version of gawk, the -c, -V, -C, -a,
- and -e options of the 2.11 version are no longer recognized.
- This fact will not even be documented in the manual page for
- version 2.16.
-
- AUTHORS
- The original version of UNIX awk was designed and implemented
- by Alfred Aho, Peter Weinberger, and Brian Kernighan
- of AT&T Bell Labs. Brian Kernighan continues to maintain and
- enhance it.
-
- Paul Rubin and Jay Fenlason, of the Free Software Foundation,
- wrote gawk, to be compatible with the original version
- of awk distributed in Seventh Edition UNIX. John Woods
- contributed a number of bug fixes. David Trueman, with
- contributions from Arnold Robbins, made gawk compatible with the
- new version of UNIX awk.
-
- The initial DOS port was done by Conrad Kwok and Scott Garfinkle.
- Scott Deifik is the current DOS maintainer. Pat
- Rankin did the port to VMS, and Michal Jaegermann did the
- port to the Atari ST. The port to OS/2 was done by Kai Uwe
- Rommel, with contributions and help from Darrel Hankerson.
-
- ACKNOWLEDGEMENTS
- Brian Kernighan of Bell Labs provided valuable assistance
- during testing and debugging. We thank him.